Skip to content

Conversation

@dor-forer
Copy link
Collaborator

@dor-forer dor-forer commented Dec 28, 2025

Describe the changes in the pull request

Added support to SQ8 to SQ8 Ip and Cosine spaces.
Intel:
AVX512_F_BW_VL_VNNI - needs it to support uint8 and in8 operations.
ARM:
SVE
NEON

The cosine functions assume that the vectors are normlized therefore don't divide by the norm.
Which issues this PR fixes

  1. MOD-13170

Main objects this PR modified

  1. ...
  2. ...

Mark if applicable

  • This PR introduces API changes
  • This PR introduces serialization changes

- Implemented inner product and cosine distance functions for SQ8-to-SQ8 vectors in SVE, NEON, and AVX512 architectures.
- Added corresponding distance function selection logic in IP_space.cpp and function headers in IP_space.h.
- Created benchmarks for SQ8-to-SQ8 distance functions to evaluate performance across different architectures.
- Developed unit tests to validate the correctness of the new distance functions against expected results.
- Ensured compatibility with existing optimization features for various CPU architectures.
@dor-forer dor-forer requested a review from Copilot December 28, 2025 09:39
@dor-forer dor-forer changed the title Sq8 to Sq8 dist functions - ip and cosine Sq8 to Sq8 dist functions - ip and cosine [MOD-13170] Dec 28, 2025
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds support for SQ8-to-SQ8 distance functions for Inner Product and Cosine similarity metrics, where both vectors are scalar-quantized 8-bit representations. This extends the existing SQ8 functionality which only handled float-to-SQ8 comparisons.

Key changes:

  • Implemented optimized SIMD versions for Intel (AVX512) and ARM (SVE, SVE2, NEON) architectures
  • Added comprehensive test coverage for all optimization variants
  • Integrated new benchmark suite for performance testing
  • Refactored existing helper function names to avoid naming conflicts

Reviewed changes

Copilot reviewed 22 out of 22 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
tests/unit/test_spaces.cpp Added unit tests for SQ8_SQ8 IP and Cosine functions across all architectures; improved test suite naming consistency
tests/benchmark/spaces_benchmarks/bm_spaces_sq8_sq8.cpp New benchmark file for SQ8_SQ8 distance functions
tests/benchmark/spaces_benchmarks/bm_spaces_sq8.cpp Removed empty comment lines (cleanup)
tests/benchmark/benchmarks.sh Added spaces_sq8_sq8 to benchmark configurations
tests/benchmark/CMakeLists.txt Added sq8_sq8 data type to build system
src/VecSim/spaces/functions/SVE2.h Added function declarations for SQ8_SQ8 IP and Cosine implementations
src/VecSim/spaces/functions/SVE2.cpp Implemented SVE2 function choosers for SQ8_SQ8 distance functions
src/VecSim/spaces/functions/SVE.h Added function declarations for SQ8_SQ8 IP and Cosine implementations
src/VecSim/spaces/functions/SVE.cpp Implemented SVE function choosers for SQ8_SQ8 distance functions
src/VecSim/spaces/functions/NEON.h Added function declarations for SQ8_SQ8 IP and Cosine implementations
src/VecSim/spaces/functions/NEON.cpp Implemented NEON function choosers for SQ8_SQ8 distance functions
src/VecSim/spaces/functions/AVX512F_BW_VL_VNNI.h Added function declarations for SQ8_SQ8 IP and Cosine implementations
src/VecSim/spaces/functions/AVX512F_BW_VL_VNNI.cpp Implemented AVX512 function choosers for SQ8_SQ8 distance functions
src/VecSim/spaces/IP_space.h Added public API declarations for SQ8_SQ8 distance functions
src/VecSim/spaces/IP_space.cpp Implemented dispatcher functions that select appropriate SIMD implementation
src/VecSim/spaces/IP/IP_SVE_SQ8_SQ8.h New file with SVE-optimized SQ8_SQ8 IP and Cosine implementations
src/VecSim/spaces/IP/IP_NEON_SQ8_SQ8.h New file with NEON-optimized SQ8_SQ8 IP and Cosine implementations
src/VecSim/spaces/IP/IP_AVX512F_SQ8_SQ8_BW_VL_VNNI.h New file with AVX512-optimized SQ8_SQ8 IP and Cosine implementations
src/VecSim/spaces/IP/IP_AVX512F_SQ8_BW_VL_VNNI.h Renamed helper function to avoid naming conflicts with new SQ8_SQ8 implementations
src/VecSim/spaces/IP/IP_AVX2_SQ8.h Renamed helper function to avoid naming conflicts and fixed cosine normalization
src/VecSim/spaces/IP/IP.h Added function declarations for SQ8_SQ8 base implementations
src/VecSim/spaces/IP/IP.cpp Implemented baseline SQ8_SQ8 InnerProduct and Cosine functions

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@codecov
Copy link

codecov bot commented Dec 28, 2025

Codecov Report

❌ Patch coverage is 97.10145% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 97.02%. Comparing base (cd722bf) to head (2e57cf2).
⚠️ Report is 2 commits behind head on main.

Files with missing lines Patch % Lines
src/VecSim/spaces/IP_space.cpp 87.50% 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #873      +/-   ##
==========================================
- Coverage   97.03%   97.02%   -0.01%     
==========================================
  Files         126      127       +1     
  Lines        7455     7506      +51     
==========================================
+ Hits         7234     7283      +49     
- Misses        221      223       +2     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 27 out of 27 changed files in this pull request and generated 12 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@dor-forer dor-forer requested a review from meiravgri December 28, 2025 14:28
@dor-forer dor-forer requested a review from meiravgri January 4, 2026 07:14
Copy link
Collaborator

@meiravgri meiravgri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Please revert changes in files that are not related to this pr (all IP_*_SQ8.h files changes)

  2. consider comparing IP_*UINT8.h optimized implmntation function to the new SQ8_SQ8 to ensure we get the best optimization

const auto *pVect1 = static_cast<const uint8_t *>(pVect1v);
const auto *pVect2 = static_cast<const uint8_t *>(pVect2v);

// Extract metadata from the end of vectors (likely already prefetched)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what does it mean "(likely already prefetched)"?

Comment on lines 2238 to 2273
float baseline = SQ8_Cosine(v1_orig.data(), v2_quantized.data(), dim);

#ifdef OPT_SVE2
if (optimization.sve2) {
unsigned char alignment = 0;
arch_opt_func = Cosine_SQ8_GetDistFunc(dim, &alignment, &optimization);
ASSERT_EQ(arch_opt_func, Choose_SQ8_Cosine_implementation_SVE2(dim))
<< "Unexpected distance function chosen for dim " << dim;
ASSERT_NEAR(baseline, arch_opt_func(v1_orig.data(), v2_quantized.data(), dim), 0.01)
<< "SVE2 with dim " << dim;
optimization.sve2 = 0;
}
#endif
#ifdef OPT_SVE
if (optimization.sve) {
unsigned char alignment = 0;
arch_opt_func = Cosine_SQ8_GetDistFunc(dim, &alignment, &optimization);
ASSERT_EQ(arch_opt_func, Choose_SQ8_Cosine_implementation_SVE(dim))
<< "Unexpected distance function chosen for dim " << dim;
ASSERT_NEAR(baseline, arch_opt_func(v1_orig.data(), v2_quantized.data(), dim), 0.01)
<< "SVE with dim " << dim;
optimization.sve = 0;
}
#endif
#ifdef OPT_NEON
if (optimization.asimd) {
unsigned char alignment = 0;
arch_opt_func = Cosine_SQ8_GetDistFunc(dim, &alignment, &optimization);
ASSERT_EQ(arch_opt_func, Choose_SQ8_Cosine_implementation_NEON(dim))
<< "Unexpected distance function chosen for dim " << dim;
ASSERT_NEAR(baseline, arch_opt_func(v1_orig.data(), v2_quantized.data(), dim), 0.01)
<< "NEON with dim " << dim;
optimization.asimd = 0;
}
#endif

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we avoid this (and all the following) change that is not related to sq8_sq8?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the PR should also handle the fixes needed with regular SQ8, as it was also converted to be used by the disk implementation

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we need to implement them for sure, but in a different pr

…Calculations

- Refactored inner product calculations for SQ8 vectors using NEON and SVE optimizations.
- Integrated UINT8_InnerProductImp for efficient dot product computation in NEON and SVE implementations.
- Updated inner product functions to handle 64-element chunks for improved performance.
- Adjusted distance function selection logic to ensure optimizations are applied only for dimensions >= 16.
- Added tests for zero vectors and constant vectors to validate optimized implementations against baseline results.
- Ensured consistency in assertions for symmetry tests across various optimization flags.
- Improved code readability and maintainability by removing redundant code and comments.
@dor-forer dor-forer requested a review from meiravgri January 4, 2026 12:09
- Updated inner product functions for NEON, SSE4, and SVE to streamline dequantization and reduce unnecessary calculations.
- Consolidated common logic for inner product and cosine calculations across different SIMD implementations.
- Enhanced the handling of vector normalization and quantization in unit tests, ensuring consistency in compressed vector sizes.
- Adjusted benchmark tests to reflect changes in vector compression and distance function calls.
- Corrected include paths for AVX512 implementations to maintain consistency across the codebase.
float SQ8_SQ8_InnerProductImp(const void *pVec1v, const void *pVec2v, size_t dimension) {
// Compute raw dot product using efficient UINT8 AVX512 VNNI implementation
// UINT8_InnerProductImp uses _mm512_dpwssd_epi32 for native integer dot product
int dot_product = UINT8_InnerProductImp<residual>(pVec1v, pVec2v, dimension);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

// NEON SQ8-to-SQ8 functions
#ifdef OPT_NEON
bool neon_supported = opt.asimd;
INITIALIZE_BENCHMARKS_SET_IP(BM_VecSimSpaces_SQ8_SQ8, SQ8_SQ8, NEON, 16, neon_supported);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason that the dim_opt is not aligned with bm_spaces_uint.cpp values?

Copy link
Collaborator Author

@dor-forer dor-forer Jan 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, just forgot to change it.

@dor-forer dor-forer requested a review from meiravgri January 4, 2026 15:13
meiravgri
meiravgri previously approved these changes Jan 4, 2026
@dor-forer dor-forer added this pull request to the merge queue Jan 4, 2026
Merged via the queue into main with commit b0fd737 Jan 4, 2026
24 checks passed
@dor-forer dor-forer deleted the dorer-sq8-dist-functions-ip-cosine branch January 4, 2026 18:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants